Graph serves as a powerful tool for modeling data that has an underlying structure in non-Euclidean space, by encoding relations as edges and entities as nodes. Despite developments in learning from graph-structured data over the years, one obstacle persists: graph imbalance. Although several attempts have been made to target this problem, they are limited to considering only class-level imbalance. In this work, we argue that for graphs, the imbalance is likely to exist at the sub-class topology group level. Due to the flexibility of topology structures, graphs could be highly diverse, and learning a generalizable classification boundary would be difficult. Therefore, several majority topology groups may dominate the learning process, rendering others under-represented. To address this problem, we propose a new framework {\method} and design (1 a topology extractor, which automatically identifies the topology group for each instance with explicit memory cells, (2 a training modulator, which modulates the learning process of the target GNN model to prevent the case of topology-group-wise under-representation. {\method} can be used as a key component in GNN models to improve their performances under the data imbalance setting. Analyses on both topology-level imbalance and the proposed {\method} are provided theoretically, and we empirically verify its effectiveness with both node-level and graph-level classification as the target tasks.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
Most existing distillation methods ignore the flexible role of the temperature in the loss function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In general, the temperature controls the discrepancy between two distributions and can faithfully determine the difficulty level of the distillation task. Keeping a constant temperature, i.e., a fixed level of task difficulty, is usually sub-optimal for a growing student during its progressive learning stages. In this paper, we propose a simple curriculum-based technique, termed Curriculum Temperature for Knowledge Distillation (CTKD), which controls the task difficulty level during the student's learning career through a dynamic and learnable temperature. Specifically, following an easy-to-hard curriculum, we gradually increase the distillation loss w.r.t. the temperature, leading to increased distillation difficulty in an adversarial manner. As an easy-to-use plug-in technique, CTKD can be seamlessly integrated into existing knowledge distillation frameworks and brings general improvements at a negligible additional computation cost. Extensive experiments on CIFAR-100, ImageNet-2012, and MS-COCO demonstrate the effectiveness of our method. Our code is available at https://github.com/zhengli97/CTKD.
translated by 谷歌翻译
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
Among current anchor-based detectors, a positive anchor box will be intuitively assigned to the object that overlaps it the most. The assigned label to each anchor will directly determine the optimization direction of the corresponding prediction box, including the direction of box regression and category prediction. In our practice of crowded object detection, however, the results show that a positive anchor does not always regress toward the object that overlaps it the most when multiple objects overlap. We name it anchor drift. The anchor drift reflects that the anchor-object matching relation, which is determined by the degree of overlap between anchors and objects, is not always optimal. Conflicts between the fixed matching relation and learned experience in the past training process may cause ambiguous predictions and thus raise the false-positive rate. In this paper, a simple but efficient adaptive two-stage anchor assignment (TSAA) method is proposed. It utilizes the final prediction boxes rather than the fixed anchors to calculate the overlap degree with objects to determine which object to regress for each anchor. The participation of the prediction box makes the anchor-object assignment mechanism adaptive. Extensive experiments are conducted on three classic detectors RetinaNet, Faster-RCNN and YOLOv3 on CrowdHuman and COCO to evaluate the effectiveness of TSAA. The results show that TSAA can significantly improve the detectors' performance without additional computational costs or network structure changes.
translated by 谷歌翻译
Personalized Federated Learning (PFL) which collaboratively trains a federated model while considering local clients under privacy constraints has attracted much attention. Despite its popularity, it has been observed that existing PFL approaches result in sub-optimal solutions when the joint distribution among local clients diverges. To address this issue, we present Federated Modular Network (FedMN), a novel PFL approach that adaptively selects sub-modules from a module pool to assemble heterogeneous neural architectures for different clients. FedMN adopts a light-weighted routing hypernetwork to model the joint distribution on each client and produce the personalized selection of the module blocks for each client. To reduce the communication burden in existing FL, we develop an efficient way to interact between the clients and the server. We conduct extensive experiments on the real-world test beds and the results show both the effectiveness and efficiency of the proposed FedMN over the baselines.
translated by 谷歌翻译
生长免费的在线3D形状集合决定了3D检索的研究。然而,已经进行了积极的辩论(i)最佳输入方式是触发检索,以及(ii)这种检索的最终用法场景。在本文中,我们为回答这些问题提供了不同的观点 - 我们研究了3D草图作为输入方式,并提倡进行检索的VR-Scenario。因此,最终的愿景是用户可以通过在VR环境中自由空气供电来自由地检索3D模型。作为新的3D VR-Sketch的首次刺入3D形状检索问题,我们做出了四个贡献。首先,我们对VR实用程序进行编码以收集3D VR-Sketches并进行检索。其次,我们从ModelNet收集了两个形状类别的第一套$ 167 $ 3D VR-SKETCHES。第三,我们提出了一种新的方法,以生成不同抽象级别类似人类的3D草图的合成数据集,以训练深层网络。最后,我们比较了常见的多视图和体积方法:我们表明,与3D形状到3D形状检索相比,基于体积点的方法在3D草图上表现出卓越的性能,并且由于稀疏和抽象的性质而显示出3D形状的检索3D VR-Sketches。我们认为,这些贡献将集体成为未来在此问题的尝试的推动者。 VR接口,代码和数据集可在https://tinyurl.com/3dsketch3dv上找到。
translated by 谷歌翻译
我们介绍了1,497个3D VR草图和具有较大形状多样性的椅子类别的3D形状对的第一个细粒数据集。我们的数据集支持草图社区的最新趋势,以细粒度的数据分析,并将其扩展到主动开发的3D域。我们争辩说最方便的草图场景,其中草图由稀疏的线条组成,并且不需要任何草图技能,事先培训或耗时的准确绘图。然后,我们首次将细粒度3D VR草图的场景研究为3D形状检索,作为一种新颖的VR素描应用程序和一个探索基础,以推动通用见解以告知未来的研究。通过实验在这个新问题上精心选择的设计因素组合,我们得出重要的结论以帮助跟进工作。我们希望我们的数据集能够启用其他新颖的应用程序,尤其是那些需要细粒角的应用程序,例如细粒度的3D形状重建。该数据集可在tinyurl.com/vrsketch3dv21上获得。
translated by 谷歌翻译
我们研究基于3D-VR-Sketch的细粒度3D形状检索的实际任务。此任务特别令人感兴趣,因为2D草图被证明是2D图像的有效查询。但是,由于域间隙,很难从2D草图中以3D形状的检索获得强劲的性能。最近的工作证明了3D VR素描在此任务上的优势。在我们的工作中,我们专注于3D VR草图中固有的不准确性造成的挑战。我们观察到,带有固定边缘值的三胞胎损失获得的检索结果,通常用于检索任务,包含许多无关的形状,通常只有一个或几个或几个具有与查询相似的结构。为了减轻此问题,我们首次在自适应边距值和形状相似性之间建立联系。特别是,我们建议使用由“拟合差距”驱动的自适应边距值的三重损失,这是在结构保护变形下的两个形状的相似性。我们还进行了一项用户研究,该研究确认这种拟合差距确实是评估形状结构相似性的合适标准。此外,我们介绍了202个VR草图的数据集,用于从内存而不是观察到的202个3D形状。代码和数据可在https://github.com/rowl1ng/structure-aware-aware-vr-sketch-shape-retrieval中找到。
translated by 谷歌翻译
尖峰神经网络(SNN)是第三代人工神经网络,可以在神经形态硬件上实施节能。但是,尖峰的离散传播给坚固且高性能的学习机制带来了重大挑战。大多数现有的作品仅着眼于神经元之间的学习,但忽略了突触之间的影响,从而导致稳健性和准确性丧失。为了解决这个问题,我们通过对突触(APB)(APB)之间的关联可塑性(APB)进行建模,从而提出了一种强大而有效的学习机制。使用提出的APB方法,当其他神经元同时刺激时,同一神经元的突触通过共享因素相互作用。此外,我们提出了一种时空种植和翻转(STCF)方法,以提高网络的概括能力。广泛的实验表明,我们的方法在静态CIFAR-10数据集和神经形态MNIST-DV的最新性能上实现了卓越的性能,通过轻量级卷积网络,CIFAR10-DVS数据集。据我们所知,这是第一次探索突触之间的学习方法和神经形态数据的扩展方法。
translated by 谷歌翻译